Fast conformational clustering of extensive molecular dynamics simulation data
We present an unsupervised data processing workflow that is specifically
designed to obtain a fast conformational clustering of long molecular dynamics
simulation trajectories. In this approach we combine two dimensionality
reduction algorithms (cc_analysis and encodermap) with a density-based spatial
clustering algorithm (HDBSCAN). The proposed scheme benefits from the strengths
of the three algorithms while avoiding most of the drawbacks of the individual
methods. Here the cc_analysis algorithm is for the first time applied to
molecular simulation data. Encodermap complements cc_analysis by providing an
efficient way to process and assign large amounts of data to clusters. The main
goal of the procedure is to maximize the number of assigned frames of a given
trajectory, while keeping a clear conformational identity of the clusters that
are found. In practice we achieve this by using an iterative clustering
approach and a tunable root-mean-square-deviation-based criterion in the final
cluster assignment. This allows clusters of different densities, as well
as different degrees of structural identity, to be found. With the help of four test systems
we illustrate the capability and performance of this clustering workflow:
wild-type and thermostable mutant of the Trp-cage protein (TC5b and TC10b),
NTL9 and Protein B. Each of these systems poses individual challenges to the
scheme, which together give a good overview of the advantages, as well as the
potential difficulties, that can arise when using the proposed method.
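The RMSD-based criterion used in the final cluster assignment can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `kabsch_rmsd` and `assign_frames`, the use of one representative structure per cluster, and the cutoff value are all assumptions made for illustration. A frame is assigned to the closest representative only if its RMSD after optimal superposition is within the tunable cutoff; otherwise it stays unassigned (label -1).

```python
import numpy as np


def kabsch_rmsd(a: np.ndarray, b: np.ndarray) -> float:
    """RMSD between two (n_atoms, 3) coordinate sets after optimal superposition."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    # Kabsch algorithm: the optimal rotation follows from the SVD of the covariance.
    u, _, vt = np.linalg.svd(a.T @ b)
    d = np.sign(np.linalg.det(u @ vt))  # guard against improper rotations
    rot = u @ np.diag([1.0, 1.0, d]) @ vt
    return float(np.sqrt(np.mean(np.sum((a @ rot - b) ** 2, axis=1))))


def assign_frames(frames, representatives, cutoff):
    """Label each frame with its nearest representative, or -1 if none is within cutoff."""
    labels = np.full(len(frames), -1, dtype=int)
    for i, frame in enumerate(frames):
        rmsds = [kabsch_rmsd(frame, rep) for rep in representatives]
        best = int(np.argmin(rmsds))
        if rmsds[best] <= cutoff:
            labels[i] = best
    return labels


# Synthetic example (hypothetical data): two representatives and three frames.
rng = np.random.default_rng(0)
rep0 = rng.normal(size=(10, 3))
rep1 = rng.normal(size=(10, 3))
angle = np.radians(40.0)
rot_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                  [np.sin(angle), np.cos(angle), 0.0],
                  [0.0, 0.0, 1.0]])
frames = [
    rep0 @ rot_z.T + np.array([1.0, 2.0, 3.0]),  # rep0, rotated and shifted
    rep1 + 1e-3 * rng.normal(size=(10, 3)),      # rep1 with slight noise
    5.0 * rng.normal(size=(10, 3)),              # unrelated structure
]
labels = assign_frames(frames, [rep0, rep1], cutoff=0.1)
```

A looser cutoff assigns more frames at the cost of less uniform clusters, which is the trade-off the abstract describes.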
Carboplatin binding to a model protein in non-NaCl conditions to eliminate partial conversion to cisplatin, and the use of different criteria to choose the resolution limit
Hen egg white lysozyme (HEWL) co-crystallisation conditions of carboplatin
without sodium chloride (NaCl) have been utilised to eliminate partial
conversion of carboplatin to cisplatin observed previously. Tetragonal HEWL
crystals were successfully obtained in 65% MPD with 0.1M citric acid buffer at
pH 4.0 including DMSO. The X-ray diffraction data resolution to be used for the
model refinement was reviewed using several topical criteria together. The
CC1/2 criterion implemented in XDS led to data being significant to 2.0 Å,
compared to the data only being able to be processed to 3.0 Å using the
Bruker software package (SAINT). Then using paired protein model refinements
and DPI values based on the FreeR value, the resolution limit was fine tuned to
be 2.3 Å. Interestingly this was compared with results from the EVAL
software package which gave a resolution limit of 2.2 Å solely using
⟨I/σ(I)⟩ crossing 2, but 2.8 Å based on the Rmerge values (60%). The
structural results showed that carboplatin bound to only the Nδ binding
site of His-15 one week after crystal growth, whereas five weeks after crystal
growth, two molecules of carboplatin are bound to the His-15 residue. In
summary, several new results have emerged: firstly, non-NaCl conditions showed
a carboplatin molecule bound to His-15 of HEWL; secondly binding of one
molecule of carboplatin was seen after one week of crystal growth and two
molecules were bound after five weeks of crystal growth; and thirdly the use of
several criteria to determine the diffraction resolution limit led to the
successful use of data to higher resolution.
Comment: 14 pages; submitted to Acta Cryst D Biological Crystallography,
reference number tz504
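For readers unfamiliar with the CC1/2 criterion mentioned above, the following minimal sketch shows the underlying idea: the measurements of each reflection are split at random into two halves, each half is averaged, and the Pearson correlation between the two sets of averages is reported. This is not the XDS implementation. The function name `cc_half`, the fixed multiplicity and the synthetic intensities are assumptions for illustration only; real unmerged data have varying multiplicity and are evaluated per resolution shell.

```python
import numpy as np


def cc_half(observations: np.ndarray, rng: np.random.Generator) -> float:
    """CC1/2 for an (n_reflections, multiplicity) array of intensity measurements."""
    half = observations.shape[1] // 2
    # Shuffle the measurements of each reflection independently, then split in two.
    shuffled = rng.permuted(observations, axis=1)
    mean_a = shuffled[:, :half].mean(axis=1)
    mean_b = shuffled[:, half:].mean(axis=1)
    return float(np.corrcoef(mean_a, mean_b)[0, 1])


# Synthetic data (hypothetical): 2000 reflections, each measured 8 times.
rng = np.random.default_rng(1)
true_intensity = rng.exponential(100.0, size=2000)
strong = true_intensity[:, None] + rng.normal(0.0, 10.0, size=(2000, 8))
noise_only = rng.normal(0.0, 10.0, size=(2000, 8))

cc_strong = cc_half(strong, rng)      # close to 1: signal dominates
cc_noise = cc_half(noise_only, rng)   # close to 0: no signal present
```

In a resolution shell where CC1/2 is no longer significantly above zero, the data carry no measurable signal, which is the basis for using it to choose a cutoff.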
Assessing and Maximizing Data Quality in Macromolecular Crystallography
The quality of macromolecular crystal structures depends, in part, on the quality and quantity of the data used to produce them. Here, we review recent shifts in our understanding of how to use data quality indicators to select a high resolution cutoff that leads to the best model, and of the potential to greatly increase data quality through the merging of multiple measurements from multiple passes of single crystals or from multiple crystals. Key factors supporting this shift are the introduction of more robust correlation coefficient based indicators of the precision of merged data sets as well as the recognition of the substantial useful information present in extensive amounts of data once considered too weak to be of value
A critical examination of the recently reported crystal structures of the human SMN protein.
A recent publication by Seng et al. in this journal reports the crystallographic structure of refolded, full-length SMN protein and two disease-relevant derivatives thereof. Here, we would like to suggest that at least two of the structures reported in that study are incorrect. We present evidence that one of the associated crystallographic datasets is derived from a crystal of the bacterial Sm-like protein Hfq and that a second dataset is derived from a crystal of the bacterial Gab protein. Both proteins are frequent contaminants of bacterially overexpressed proteins which might have been co-purified during metal affinity chromatography. A third structure presented in the Seng et al. paper cannot be examined further because neither the atomic coordinates, nor the diffraction intensities were made publicly available. The Tudor domain protein SMN has been shown to be a component of the SMN complex, which mediates the assembly of RNA-protein complexes of uridine-rich small nuclear ribonucleoproteins (UsnRNPs). Importantly, this activity is reduced in SMA patients, raising the possibility that the aetiology of SMA is linked to RNA metabolism. Structural studies on diverse components of the SMN complex, including fragments of SMN itself have contributed greatly to our understanding of the cellular UsnRNP assembly machinery. Yet full-length SMN has so far evaded structural elucidation. The Seng et al. study claimed to have closed this gap, but based on the results presented here, the only conclusion that can be drawn is that the Seng et al. study is largely invalid and should be retracted from the literature
Crystal structure of rhodopsin bound to arrestin by femtosecond X-ray laser.
G-protein-coupled receptors (GPCRs) signal primarily through G proteins or arrestins. Arrestin binding to GPCRs blocks G protein interaction and redirects signalling to numerous G-protein-independent pathways. Here we report the crystal structure of a constitutively active form of human rhodopsin bound to a pre-activated form of the mouse visual arrestin, determined by serial femtosecond X-ray laser crystallography. Together with extensive biochemical and mutagenesis data, the structure reveals an overall architecture of the rhodopsin-arrestin assembly in which rhodopsin uses distinct structural elements, including transmembrane helix 7 and helix 8, to recruit arrestin. Correspondingly, arrestin adopts the pre-activated conformation, with a ∼20° rotation between the amino and carboxy domains, which opens up a cleft in arrestin to accommodate a short helix formed by the second intracellular loop of rhodopsin. This structure provides a basis for understanding GPCR-mediated arrestin-biased signalling and demonstrates the power of X-ray lasers for advancing the frontiers of structural biology
Ternary structure reveals mechanism of a membrane diacylglycerol kinase
Diacylglycerol kinase catalyses the ATP-dependent conversion of diacylglycerol to phosphatidic acid in the plasma membrane of Escherichia coli. The small size of this integral membrane trimer, which has 121 residues per subunit, means that available protein must be used economically to craft three catalytic and substrate-binding sites centred about the membrane/cytosol interface. How nature has accomplished this extraordinary feat is revealed here in a crystal structure of the kinase captured as a ternary complex with bound lipid substrate and an ATP analogue. Residues, identified as essential for activity by mutagenesis, decorate the active site and are rationalized by the ternary structure. The g-phosphate of the ATP analogue is positioned for direct transfer to the primary hydroxyl of the lipid whose acyl chain is in the membrane. A catalytic mechanism for this unique enzyme is proposed. The active site architecture shows clear evidence of having arisen by convergen
Simulation of X-ray frames from macromolecular crystals using a ray-tracing approach
An algorithm is described which simulates a data set obtained from a protein crystal using the rotation method. The diffraction pattern of an ideal crystal is specified by the orientation of the crystal's cell axes with respect to a specified laboratory coordinate system, the distance between the crystal and the detector, the wavelength and the rotation range per frame. However, a realistic simulation of an experiment additionally requires at least a plausible physical model for crystal mosaicity and beam properties. To explore the physical basis of reflection shape and rocking-curve variation, the algorithm simulates the diffraction of a real crystal composed of mosaic blocks which is illuminated with a beam of given divergence and dispersion. Ray tracing for each reflection leads to reflection shapes and rocking curves that appear realistic. A program implementing the algorithm may be used to reproducibly generate data sets that model different physical aspects (imperfections) of the crystal and the experiment. Certain types of systematic errors of the experimental apparatus may also be simulated. Further applications include teaching and characterization of the properties of data-reduction algorithms
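The ideal-crystal part of the geometry described above can be sketched in a few lines. This is only the textbook Ewald-sphere condition for the rotation method, not the ray-tracing algorithm of the paper: mosaicity, beam divergence and dispersion are ignored. The coordinate convention (beam along +z, rotation about y), the function names and the example reciprocal-lattice point are all assumptions chosen for illustration.

```python
import numpy as np


def rotate_about_y(vec: np.ndarray, phi_deg: float) -> np.ndarray:
    """Rotate a 3-vector about the y axis (assumed rotation axis) by phi_deg degrees."""
    phi = np.radians(phi_deg)
    c, s = np.cos(phi), np.sin(phi)
    rot = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    return rot @ vec


def excitation_error(recip_vec: np.ndarray, phi_deg: float, wavelength: float) -> float:
    """Signed distance of the rotated reciprocal-lattice point from the Ewald sphere."""
    s0 = np.array([0.0, 0.0, 1.0 / wavelength])  # incident beam along +z (assumption)
    s = rotate_about_y(recip_vec, phi_deg) + s0   # would-be diffracted wave vector
    return float(np.linalg.norm(s) - 1.0 / wavelength)


def diffracts_in_frame(recip_vec, phi_start, phi_end, wavelength) -> bool:
    """True if the point crosses the sphere once within the frame (sign change)."""
    e0 = excitation_error(recip_vec, phi_start, wavelength)
    e1 = excitation_error(recip_vec, phi_end, wavelength)
    return e0 * e1 <= 0.0


# Hypothetical reciprocal-lattice point (in 1/Å) lying on the sphere at phi = 0.
h = np.array([1.0, 0.0, -1.0])
on_sphere = excitation_error(h, 0.0, wavelength=1.0)
hit = diffracts_in_frame(h, -0.5, 0.5, wavelength=1.0)
miss = diffracts_in_frame(h, 5.0, 6.0, wavelength=1.0)
```

The sign-change test assumes the frame is narrow enough that a point does not enter and leave the sphere within the same frame. A realistic simulation would replace this sharp condition with a rocking curve of finite width.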
Dissecting random and systematic differences between noisy composite data sets
Composite data sets measured on different objects are usually affected by random errors, but may also be influenced by systematic (genuine) differences in the objects themselves, or the experimental conditions. If the individual measurements forming each data set are quantitative and approximately normally distributed, a correlation coefficient is often used to compare data sets. However, the relations between data sets are not obvious from the matrix of pairwise correlations since the numerical value of the correlation coefficient is lowered by both random and systematic differences between the data sets. This work presents a multidimensional scaling analysis of the pairwise correlation coefficients which places data sets into a unit sphere within low-dimensional space, at a position given by their CC* values [as defined by Karplus & Diederichs (2012), Science, 336, 1030-1033] in the radial direction and by their systematic differences in one or more angular directions. This dimensionality reduction can not only be used for classification purposes, but also to derive data-set relations on a continuous scale. Projecting the arrangement of data sets onto the subspace spanned by systematic differences (the surface of a unit sphere) allows, irrespective of the random-error levels, the identification of clusters of closely related data sets. The method gains power with increasing numbers of data sets. It is illustrated with an example from low signal-to-noise ratio image processing, and an application in macromolecular crystallography is shown, but the approach is completely general and thus should be widely applicable.
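The separation of radial and angular information described above can be illustrated with a small sketch. It does not perform the multidimensional scaling itself. It starts from hypothetical low-dimensional positions and assumes, as a simplification, that each pairwise correlation is modelled by the dot product of the two position vectors. The CC* formula is the one defined by Karplus & Diederichs (2012); the function names and the example vectors are invented for illustration.

```python
import numpy as np


def cc_star(cc_half: float) -> float:
    """CC* = sqrt(2*CC1/2 / (1 + CC1/2)), as defined by Karplus & Diederichs (2012)."""
    return float(np.sqrt(2.0 * cc_half / (1.0 + cc_half)))


def split_radial_angular(vectors: np.ndarray):
    """Return vector lengths and the matrix of cosines between unit-sphere projections."""
    lengths = np.linalg.norm(vectors, axis=1)
    unit = vectors / lengths[:, None]
    return lengths, unit @ unit.T


# Hypothetical 2-D positions: A and B share a direction (no systematic difference),
# but B has a shorter vector (more random error); C points 60 degrees away.
vectors = np.array([
    [0.9, 0.0],                                          # A
    [0.5, 0.0],                                          # B
    [0.9 * np.cos(np.pi / 3), 0.9 * np.sin(np.pi / 3)],  # C
])
modelled_cc = vectors @ vectors.T  # off-diagonal entries model the pairwise CCs
lengths, cos_angles = split_radial_angular(vectors)
```

Here the modelled correlation of A with B (0.45) is similar to that of A with C (0.405), so the raw correlation matrix does not distinguish the two situations. After projection onto the unit sphere, A and B coincide (cosine 1) while A and C remain separated (cosine 0.5), which is why clustering on the angular part is insensitive to the random-error level.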